substitute confounder
Spatial Deconfounder: Interference-Aware Deconfounding for Spatial Causal Inference
Khot, Ayush, Oprescu, Miruna, Schröder, Maresa, Kagawa, Ai, Luo, Xihaier
Causal inference in spatial domains faces two intertwined challenges: (1) unmeasured spatial factors, such as weather, air pollution, or mobility, that confound treatment and outcome, and (2) interference from nearby treatments that violates standard no-interference assumptions. While existing methods typically address one by assuming away the other, we show they are deeply connected: interference reveals structure in the latent confounder. Leveraging this insight, we propose the Spatial Deconfounder, a two-stage method that reconstructs a substitute confounder from local treatment vectors using a conditional variational autoencoder (CVAE) with a spatial prior, then estimates causal effects via a flexible outcome model. We show that this approach enables nonparametric identification of both direct and spillover effects under weak assumptions, without requiring multiple treatment types or a known model of the latent field. Empirically, we extend SpaCE, a benchmark suite for spatial confounding, to include treatment interference, and show that the Spatial Deconfounder consistently improves effect estimation across real-world datasets in environmental health and social science. By turning interference into a multi-cause signal, our framework bridges spatial and deconfounding literatures to advance robust causal inference in structured data.
Causal inference in spatial settings is critical for science and policy, from estimating the health effects of pollution to evaluating land use, climate interventions, and the spread of infectious disease. Most data in these domains are observational, since large-scale interventions are typically infeasible or unethical, so robust methodology is needed to draw valid conclusions.
Yet observational studies in these settings face two fundamental challenges that standard methods rarely address together: (1) spillover (interference), where the treatment at one site affects outcomes at nearby sites, violating the Stable Unit Treatment Value Assumption (SUTVA), and (2) spatially structured unobserved confounding, where latent fields such as weather or socioeconomic context jointly drive exposures and outcomes.
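One way to picture the "local treatment vectors" mentioned in the abstract is as each site's own treatment stacked with the treatments of its neighbors. The sketch below does this on a regular grid. The grid layout, the square window, the NaN padding at the boundary, and the function name `local_treatment_vectors` are all illustrative assumptions; the paper's actual construction is not given in this excerpt.

```python
import numpy as np

def local_treatment_vectors(treatment, radius=1):
    """Stack each site's treatment with its neighbours' treatments.

    treatment: 2-D array on a regular grid (an assumption; real data may be irregular).
    Returns an array of shape (rows * cols, (2 * radius + 1) ** 2).
    Neighbours that fall outside the grid are filled with NaN.
    """
    rows, cols = treatment.shape
    padded = np.pad(treatment.astype(float), radius, constant_values=np.nan)
    k = 2 * radius + 1
    out = np.empty((rows * cols, k * k))
    for i in range(rows):
        for j in range(cols):
            # Window centred on site (i, j), flattened row by row.
            out[i * cols + j] = padded[i:i + k, j:j + k].ravel()
    return out

# Toy 3 x 4 grid of treatment values.
treatment = np.arange(12).reshape(3, 4)
vecs = local_treatment_vectors(treatment, radius=1)
```

With `radius=1`, the middle column of `vecs` holds each site's own treatment and the remaining eight columns hold the neighboring treatments, which is the kind of multi-cause input a factor model could then be fitted to.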
Deep Causal Reasoning for Recommendations
Zhu, Yaochen, Yi, Jing, Xie, Jiayi, Chen, Zhenzhong
Traditional recommender systems aim to estimate a user's rating of an item based on observed ratings from the population. As with all observational studies, hidden confounders, which are factors that affect both item exposures and user ratings, lead to a systematic bias in the estimation. Consequently, a new trend in recommender system research is to negate the influence of confounders from a causal perspective. Observing that confounders in recommendations are usually shared among items and are therefore multi-cause confounders, we model the recommendation as a multi-cause multi-outcome (MCMO) inference problem. Specifically, to remedy confounding bias, we estimate user-specific latent variables that render the item exposures independent Bernoulli trials. The generative distribution is parameterized by a DNN with factorized logistic likelihood and the intractable posteriors are estimated by variational inference. Controlling these factors as substitute confounders, under mild assumptions, can eliminate the bias incurred by multi-cause confounders. Furthermore, we show that MCMO modeling may lead to high variance due to scarce observations associated with the high-dimensional causal space. Fortunately, we theoretically demonstrate that introducing user features as pre-treatment variables can substantially improve sample efficiency and alleviate overfitting. Empirical studies on simulated and real-world datasets show that the proposed deep causal recommender is more robust to unobserved confounders than state-of-the-art causal recommenders. Code and datasets are released at https://github.com/yaochenzhu/deep-deconf.
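The "factorized logistic likelihood" can be illustrated with a small sketch: given a user's latent vector, each item exposure is treated as an independent Bernoulli trial, so the log-likelihood is a sum over items. The linear decoder (`W`, `b`) below stands in for the DNN described in the abstract, and the function name is hypothetical. Variational inference of the latents is not shown.

```python
import numpy as np

def exposure_log_likelihood(a, z, W, b):
    """Log-probability of binary exposures given user latents.

    a: (n_users, n_items) 0/1 exposure matrix
    z: (n_users, k) latent variables (assumed already inferred)
    W, b: weights and bias of a linear decoder (a stand-in for a DNN)
    """
    logits = z @ W + b
    # Numerically stable log sigmoid(x) and log(1 - sigmoid(x)).
    log_p1 = -np.logaddexp(0.0, -logits)
    log_p0 = -np.logaddexp(0.0, logits)
    # Factorized likelihood: items are independent given z, so log-probs add up.
    return (a * log_p1 + (1 - a) * log_p0).sum(axis=1)

a = np.array([[1, 0, 1]])
z = np.zeros((1, 2))
W = np.ones((2, 3))
ll_uniform = exposure_log_likelihood(a, z, W, np.zeros(3))
ll_confident = exposure_log_likelihood(a, z, W, np.array([50.0, -50.0, 50.0]))
```

With all logits at zero, every exposure has probability one half, so `ll_uniform` equals three times `log(0.5)`; with logits that strongly agree with the observed exposures, `ll_confident` is close to zero.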
Naïve regression requires weaker assumptions than factor models to adjust for multiple cause confounding
Grimmer, Justin, Knox, Dean, Stewart, Brandon M.
The empirical practice of using factor models to adjust for shared, unobserved confounders, $\mathbf{Z}$, in observational settings with multiple treatments, $\mathbf{A}$, is widespread in fields including genetics, networks, medicine, and politics. Wang and Blei (2019, WB) formalize these procedures and develop the "deconfounder," a causal inference method using factor models of $\mathbf{A}$ to estimate "substitute confounders," $\hat{\mathbf{Z}}$, then estimating treatment effects by regressing the outcome, $\mathbf{Y}$, on part of $\mathbf{A}$ while adjusting for $\hat{\mathbf{Z}}$. WB claim the deconfounder is unbiased when there are no single-cause confounders and $\hat{\mathbf{Z}}$ is "pinpointed." We clarify that pinpointing requires each confounder to affect infinitely many treatments. We prove that, under these assumptions, a naïve semiparametric regression of $\mathbf{Y}$ on $\mathbf{A}$ is asymptotically unbiased. Deconfounder variants nesting this regression are therefore also asymptotically unbiased, but variants using $\hat{\mathbf{Z}}$ and subsets of causes require further untestable assumptions. We replicate every deconfounder analysis with available data and find it fails to consistently outperform naïve regression. In practice, the deconfounder produces implausible estimates in WB's case study of movie earnings: estimates suggest comic author Stan Lee's cameo appearances causally contributed \$15.5 billion, most of Marvel movie revenue. We conclude neither approach is a viable substitute for careful research design in real-world applications.
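The intuition behind naïve regression becoming asymptotically unbiased can be seen in a toy linear model, which is an assumption made here for illustration and not the authors' semiparametric setting. Suppose each of m causes is $A_j = z + \varepsilon_j$ with unit variances, and the outcome depends on the causes plus $\gamma z$. The sketch computes the population bias of regressing the outcome on all m causes while omitting z; the function name `naive_bias` is hypothetical.

```python
import numpy as np

def naive_bias(m, gamma=3.0):
    """Population omitted-variable bias of OLS of Y on all m causes.

    Toy model (assumed): A_j = z + eps_j, Y = beta'A + gamma * z + noise,
    with z, eps_j and noise independent and of unit variance.
    """
    lam = np.ones(m)                          # loading of z on each cause
    cov_A = np.eye(m) + np.outer(lam, lam)    # Var(A)
    cov_Az = lam                              # Cov(A, z)
    # Bias = gamma * Var(A)^{-1} Cov(A, z)
    return gamma * np.linalg.solve(cov_A, cov_Az)

for m in (4, 40, 400):
    print(m, naive_bias(m)[0])
```

In this toy model the bias on every coefficient works out to gamma / (1 + m), so it shrinks toward zero as the single confounder affects more and more treatments, without any factor model being fitted.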
Comment on "Blessings of Multiple Causes"
Ogburn, Elizabeth L., Shpitser, Ilya, Tchetgen Tchetgen, Eric J.
This scenario is directly analogous to longitudinal causal inference problems with multiple time-varying treatments that contain time-varying confounders, variables that serve as confounders for some treatments and as mediators for other treatments. If there is an unmeasured confounder for the R-Y relationship (represented by V and the dashed arrows in Figure 1(a)), then conditioning on R fails to identify the direct effects of A on Y, because it opens a confounding pathway through V. See Hernan and Robins (2020) for an overview of these issues. The answer to the question posed in Appendix B of WB, "Can the causes be causally dependent among themselves?" is therefore "no." If they are causally dependent then the deconfounder, by dint of rendering the causes independent, breaks some of the structure among the causes A, and as was originally established in the time-varying treatment setting, this undermines the identification of joint effects of A on Y by covariate adjustment. Analysis of Lemma 4. This simple argument also serves as a counterexample to Lemma 4, which states that the deconfounder does not pick up any post-treatment variables and can be treated as a pre-treatment covariate. This is necessarily false whenever the causes are causally dependent among themselves, but it need not hold even if the causes are not causally dependent, see below. The proof of Lemma 4 in Appendix I states that "Inferring the substitute confounder Z
Discussion of "The Blessings of Multiple Causes" by Wang and Blei
We begin by congratulating Yixin Wang and David Blei for their thought-provoking article that opens up a new research frontier in the field of causal inference. The authors directly tackle the challenging question of how to infer causal effects of many treatments in the presence of unmeasured confounding. We expect their article to have a major impact by further advancing our understanding of this important methodological problem. This commentary has two goals. We first discuss several advantages offered by the deconfounder method. We then examine the assumptions required by the method and discuss its limitations. We then briefly consider three possible ways to address some of the limitations of the deconfounder method.
Multiple Causes: A Causal Graphical View
Unobserved confounding is a major hurdle for causal inference from observational data. Confounders---the variables that affect both the causes and the outcome---induce spurious non-causal correlations between the two. Wang & Blei (2018) lower this hurdle with "the blessings of multiple causes," where the correlation structure of multiple causes provides indirect evidence for unobserved confounding. They leverage these blessings with an algorithm, called the deconfounder, that uses probabilistic factor models to correct for the confounders. In this paper, we take a causal graphical view of the deconfounder. In a graph that encodes shared confounding, we show how the multiplicity of causes can help identify intervention distributions. We then justify the deconfounder, showing that it makes valid inferences of the intervention. Finally, we expand the class of graphs, and its theory, to those that include other confounders and selection variables. Our results expand the theory in Wang & Blei (2018), justify the deconfounder for causal graphs, and extend the settings where it can be used.
The Medical Deconfounder: Assessing Treatment Effect with Electronic Health Records (EHRs)
Zhang, Linying, Wang, Yixin, Ostropolets, Anna, Mulgrave, Jami J., Blei, David M., Hripcsak, George
Causal estimation of treatment effect has an important role in guiding physicians' decision process for drug prescription. While treatment effect is classically assessed with randomized controlled trials (RCTs), the availability of electronic health records (EHRs) brings an unprecedented opportunity for more efficient estimation. However, the presence of unobserved confounders makes treatment effect assessment from EHRs a challenging task. Confounders are the variables that affect both drug prescription and the patient's outcome; examples include a patient's gender, race, socioeconomic status and comorbidities. When these confounders are unobserved, they bias the estimation. To adjust for unobserved confounders, we develop the medical deconfounder, a machine learning algorithm that unbiasedly estimates treatment effect from EHRs. The medical deconfounder first constructs a substitute confounder by modeling which drugs were prescribed to each patient; this substitute confounder is guaranteed to capture all multi-drug confounders, observed or unobserved (Wang and Blei, 2018). It then uses this substitute confounder to adjust for the confounding bias in the analysis. We validate the medical deconfounder on simulations and two medical data sets. The medical deconfounder produces closer-to-truth estimates in simulations and identifies effective medications that are more consistent with the findings reported in the medical literature compared to classical approaches.
The Blessings of Multiple Causes
Causal inference from observational data often assumes "strong ignorability," that all confounders are observed. This assumption is standard yet untestable. However, many scientific studies involve multiple causes, different variables whose effects are simultaneously of interest. We propose the deconfounder, an algorithm that combines unsupervised machine learning and predictive model checking to perform causal inference in multiple-cause settings. The deconfounder infers a latent variable as a substitute for unobserved confounders and then uses that substitute to perform causal inference. We develop theory for when the deconfounder leads to unbiased causal estimates, and show that it requires weaker assumptions than classical causal inference. We analyze its performance in three types of studies: semi-simulated data around smoking and lung cancer, semi-simulated data around genomewide association studies, and a real dataset about actors and movie revenue. The deconfounder provides a checkable approach to estimating close-to-truth causal effects.
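The two-stage idea in this abstract (infer a latent substitute from the causes, then adjust for it) can be sketched on simulated data. Everything below is an illustrative assumption: the data-generating numbers are made up, principal component analysis stands in for the probabilistic factor model, and the predictive model check is omitted. Because a PCA score is a linear function of all the causes, the sketch adjusts only a subset of causes (the first three); including every cause alongside the score would make the regression collinear.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5000, 30

# Simulated multi-cause data (all numbers are made up for illustration).
z = rng.normal(size=n)                               # unobserved shared confounder
loadings = rng.uniform(0.5, 1.5, size=m)
A = np.outer(z, loadings) + rng.normal(size=(n, m))  # observed causes
beta = np.zeros(m)
beta[:3] = [1.0, -0.5, 0.25]                         # true effects of the causes of interest
y = A @ beta + 3.0 * z + rng.normal(size=n)

def ols(X, y):
    """Least-squares slopes of y on X (intercept fitted, then dropped)."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef[1:]

# Stage 1: fit a factor model to the causes and infer a substitute confounder.
A_centered = A - A.mean(axis=0)
_, _, Vt = np.linalg.svd(A_centered, full_matrices=False)
z_hat = A_centered @ Vt[0]                           # first principal-component score

# Stage 2: outcome regression on the causes of interest, with and without z_hat.
naive = ols(A[:, :3], y)
adjusted = ols(np.column_stack([A[:, :3], z_hat]), y)[:3]

naive_err = np.abs(naive - beta[:3]).max()
adjusted_err = np.abs(adjusted - beta[:3]).max()
print(naive_err, adjusted_err)
```

In this simulation the adjusted estimates land much closer to the true effects than the unadjusted ones. That outcome depends on the assumed setup: a single confounder shared by many causes, no single-cause confounders, and a factor model that matches how the causes were generated.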